
288 American Statistical Association. [64 



COMMENT ON WESTERGAARD'S "SCOPE AND 
METHOD OF STATISTICS." 

By G. P. Watkins, New York, N. Y. 



The special viewpoint of Westergaard would by many be 
called mathematical, or he would be more definitely described 
as one who thinks of statistics as an application of the mathe- 
matical theory of probabilities. The description is substan- 
tially correct. Its appropriateness appears in the article under 
discussion and would be borne out by an examination of the 
author's Grundzüge der Theorie der Statistik. But such a 
summary statement is seldom adequate. Some statisticians 
are statisticians, and some are mathematicians. Of course 
the implied opposition needs qualification. Indeed the most 
that need be claimed for it is some relative significance. The 
point of present interest is that Westergaard belongs with the 
statisticians. He brings to the aid of statistics certain 
mathematical conceptions, but he has no preference for the 
processes and expressions of algebraic mathematics and uses 
them only to serve his statistical purposes. This is not 
always true of the mathematical statistician. It is a very 
superficial view that supposes, because both mathematics and 
statistics are concerned with numbers, competence in the 
one involves competence in the other. Great familiarity with 
abstract numbers and great satisfaction in dealing with 
them do not promote skill in handling and interpreting 
numbers standing for quantities and quantitative relations 
of concrete things. That a statistician should be "quick at 
figures" and fond of equations and mathematical formulas 
is no more true than the other extreme view that only a 
physician can be competent in vital statistics, only a teacher 
in educational statistics, only a farmer in statistics of agricul- 
ture, etc. There is an element of truth in the second view- 
point. Concrete knowledge of the things to which statistical 
numbers relate is of great importance. The mistakes that 
are most commonly used to discredit statisticians (though they 
are not in fact very harmful) are due to lack of sufficient 
knowledge of the concrete situations to which the numbers 
dealt with relate. But one can know something about 
farming without having spent years handling a pitchfork, and 
about the railroads without having been a brakeman, etc. 
Moreover, statistics has methods and a technique of its own 
— distinct from those of mathematics — with which professional 
familiarity is as necessary for the statistician as is the pro- 
fessional familiarity of the physician with disease, of the 
farmer with crops, etc. 

The frequency and appropriateness of Westergaard's refer- 
ences to the necessity of keeping the original data in sight is a 
conspicuous characteristic of his article and the one that has 
suggested the foregoing remarks upon the fundamental con- 
creteness of statistical numbers. A death rate, for example, 
should not be thought of as a mere ratio. Its meaning 
depends upon age, sex, race, etc. The "etc." is not a mere 
form. The statistician would like to differentiate and classify 
ad infinitum the deaths and the persons exposed to death, and 
find the correlation, if any, between the various pairs of 
classifications. The quality of statistical analysis is not 
secured by correctness of calculations, nor by appropriate 
application of mathematical theory and skillful manipulation 
of mathematical formulas. Accuracy of observations is more 
important. But statistical accuracy does not mean mathe- 
matical exactness. It means rather the reduction of biased 
error to a minimum, and also — but this may usually be taken 
for granted — the use of large enough aggregates so that 
unbiased errors cancel each other. The more or less selective 
character of observation, statistical and other, and the varied 
composition of concrete numbers — the two factors working 
together — constitute the great source of wrong statistical 
conclusions. The "probable error" in which the mathematical 
statistician is particularly interested is of comparatively little 
significance. 
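
The contrast drawn above between unbiased and biased error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name, the true value of 100, the uniform spread of error, and the bias of 2 are all invented for illustration and come from no actual returns. The probable error is computed in the usual way, as 0.6745 times the standard error of the mean.

```python
import random
import statistics


def simulate_mean(true_value, n, bias, spread, seed=0):
    """Return (mean, probable error of the mean) for n simulated observations.

    Each observation is the true value, plus a constant bias, plus an
    unbiased accidental error drawn uniformly from [-spread, spread].
    All figures are hypothetical.
    """
    rng = random.Random(seed)
    observations = [
        true_value + bias + rng.uniform(-spread, spread) for _ in range(n)
    ]
    mean = statistics.fmean(observations)
    # Probable error of the mean: 0.6745 times the standard error.
    probable_error = 0.6745 * statistics.stdev(observations) / n ** 0.5
    return mean, probable_error


# A large aggregate with accidental error only: the errors nearly cancel.
unbiased_mean, unbiased_pe = simulate_mean(100.0, 10_000, bias=0.0, spread=5.0)

# The same aggregate with a constant bias: the probable error is just as
# small, yet the mean is wrong by roughly the whole amount of the bias.
biased_mean, biased_pe = simulate_mean(100.0, 10_000, bias=2.0, spread=5.0)

print(f"unbiased: mean={unbiased_mean:.3f}, probable error={unbiased_pe:.4f}")
print(f"biased:   mean={biased_mean:.3f}, probable error={biased_pe:.4f}")
```

In the biased case the probable error gives no warning of the mistake, since it measures only the accidental component.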

There seems to be no occasion to add anything to what 
Westergaard has to say of the significance of the probable 
error for statistical analysis. But there is one matter of 
terminology in relation to which I would express dissent. 
The reference to sample statistics as "representative statistics" 
seems to me infelicitous. The adjective should refer to 
function and quality rather than to a situation and a numerical 
relation in which the quality is chiefly assumed, the only 
reason for alleging that samples are representative being that 
their selection is presumed to be unbiased. Moreover, we 
need the word in a truer sense for representative numbers, 
where the process by which they are obtained does have 
explicit reference to the function and quality in question. 
I refer to "averages" of all sorts, mathematical and un- 
mathematical, weighted (or true) and unweighted (and 
apparently hap-hazard). Representative numbers, in the 
preferred sense, and relative numbers are the two great 
instruments of statistical analysis. Representative numbers 
are such as provide a condensed substitute, in some one or 
another use of the figures, for the complete aggregate. An 
average represents an aggregate. We try to obtain sample 
numbers that will represent the comprehensive aggregate 
where it is impossible to observe statistically the latter as a 
whole. But the statistical process in this case assumes rather 
than assures the representativeness desired. 
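
The sense in which a weighted average is the "true" representative of an aggregate may be shown by a short computation. The sketch below assumes two hypothetical districts with invented death rates and populations; the function names are likewise illustrative only.

```python
def unweighted_mean(rates):
    """Simple average of the rates, ignoring the size of each class."""
    return sum(rates) / len(rates)


def weighted_mean(rates, weights):
    """Average of the rates, each weighted by the size of its class."""
    total = sum(rate * weight for rate, weight in zip(rates, weights))
    return total / sum(weights)


# Hypothetical figures: deaths per 1,000 and population for two districts.
district_rates = [10.0, 20.0]
district_population = [9000, 1000]

unweighted = unweighted_mean(district_rates)
weighted = weighted_mean(district_rates, district_population)

print(f"unweighted: {unweighted}, weighted: {weighted}")
```

Only the weighted figure reproduces the rate of the combined population (110 deaths among 10,000 persons), and so only it can stand as a condensed substitute for the aggregate.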

One matter that is quite pertinent to the viewpoint of 
Westergaard's article seems to me not sufficiently developed. 
A statistical discussion of "probable error" should not stop 
at the application of mathematical theory. It should con- 
sider the margin of error in a broader sense. "Probable 
error" in the mathematical sense deals with accidental and 
unbiased and relatively small errors. But the statistician 
who has had to do with the interpretation of and compilation 
from enumeration and report schedules does not worry about 
such unrepresentativeness of his results as may be due to a 
large "probable" error. Most government statistical offices 
deal, not with samples, but with comprehensive or complete 
aggregates, which are also large enough to get the full benefit 
of the "law of large numbers." Where the "probable error" 
is negligible, however, the margin of error may be large. The 
practically important meaning of accuracy relates to the 
latter situation. The accuracy of a census may be tested by 
repeated sample enumerations, by known or determinable 
coloring or inefficiency in administration, by internal evidences 
of duplication or omission, by the known functional relation 
of true results to figures obtained by other methods, etc. 
But all such errors are presumably biased and they tend to 
be too large to be left to chance cancellation. The problem is 
not mathematical. 

I am not able to say whether Westergaard's "method of 
expected cases" is his original contribution to statistics or 
not. Perhaps the method is not in principle different enough 
from the "standard calculation," generally known in its 
application to the "correction of the death rate," to make 
the question of originality worth particular consideration. 
The "correction" method is substantially the inverse of the 
method of expected cases. At any rate the development and 
use of the latter by Westergaard affords an example that 
ought to influence other fields besides that of vital statistics. 
I obtained my conception of the importance of the method 
from a reading of his Mortalität und Morbilität (2d edition, 
1901). Section 18 of chapter I introduces the idea, and the 
frequent apt use of the method throughout is not the least 
reason why the book deservedly takes the highest rank 
among statistical monographs. The method deserves a 
distinct and important place in general statistical theory. 
It is both simple and of broad applicability. I have found 
what is in effect the same method useful in studying street 
railway traffic increases. It should be of interest for the study 
of rates of growth in general. In allowing for the effect of 
annexed data upon per cent. increases, the principle, under 
whatever name, must be taken into consideration. It should 
be noted that the method is not only elementary, as regards 
the mathematics involved, but that the required direction of 
attention is to the concrete. 
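
The elementary character of the method may be seen in a brief computation. The sketch assumes that the method of expected cases consists in applying a set of standard rates to the group's own numbers exposed in each class, and then comparing the deaths actually observed with the number so expected. The age classes, rates, and counts below are invented for illustration and are not taken from Westergaard.

```python
def expected_cases(exposed_by_class, standard_rates):
    """Cases expected if each class experienced the standard rate."""
    return sum(
        exposed_by_class[name] * standard_rates[name]
        for name in exposed_by_class
    )


# Hypothetical standard death rates (deaths per person per year) by age class.
standard_rates = {"under 15": 0.010, "15-49": 0.005, "50 and over": 0.040}

# Hypothetical group under study: persons exposed in each age class,
# and the total deaths actually observed among them.
exposed = {"under 15": 2000, "15-49": 6000, "50 and over": 2000}
observed_deaths = 156

expected = expected_cases(exposed, standard_rates)
ratio = observed_deaths / expected

print(f"expected deaths: {expected:.1f}")
print(f"observed / expected: {ratio:.2f}")
```

A ratio above unity indicates more deaths than the age composition of the group would account for. The crude rate alone could not show this, since it mixes the effect of age composition with that of mortality proper.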



